AITopics | Wisconsin

Wisconsin

Offline Actor-Critic for Average Reward MDPs

Neural Information Processing Systems | Jun-22-2026, 23:37:33 GMT

We study offline policy optimization for infinite-horizon average-reward Markov decision processes (MDPs) with large or infinite state spaces. Specifically, we propose a pessimistic version of actor-critic methods using a computationally efficient linear function class for value function estimation. At the core of our method is a critic that computes a pessimistic estimate of the average reward under the current policy, as well as the corresponding policy gradient, by solving a fixed-point Bellman equation, rather than solving a successive sequence of regression problems as in finite-horizon settings. Due to the nature of our policy-based method, the critic only needs to solve a linear optimization problem with convex quadratic constraints. We show that a very mild data coverage requirement is sufficient for our algorithm to achieve O(ε^{-2}) sample complexity for learning a near-optimal policy up to model misspecification errors. To our knowledge, this is the first result with optimal ε-dependence in the offline average-reward setting.

artificial intelligence, machine learning, proceedings, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Optimal Mistake Bounds for Transductive Online Learning

Neural Information Processing Systems | Jun-22-2026, 23:31:44 GMT

We resolve a 30-year-old open problem concerning the power of unlabeled data in online learning by tightly quantifying the gap between transductive and standard online learning. In the standard setting, the optimal mistake bound is characterized by the Littlestone dimension d of the concept class H (Littlestone, 1987). We prove that in the transductive setting, the mistake bound is at least Ω(√d). This constitutes an exponential improvement over previous lower bounds of Ω(log log(d)), Ω(√log(d)), and Ω(log(d)), due respectively to Ben-David, Kushilevitz, and Mansour (1995, 1997), and Hanneke, Moran, and Shafer (2023). We also show that this lower bound is tight: for every d, there exists a class of Littlestone dimension d with transductive mistake bound O(√d). Our upper bound also improves upon the best known upper bound of (2/3)·d from Ben-David et al. (1997). These results establish a quadratic gap between transductive and standard online learning, thereby highlighting the benefit of advance access to the unlabeled instance sequence. This contrasts with the PAC setting, where transductive and standard learning exhibit similar sample complexities.

adversary, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > Wisconsin (0.28)
North America > Canada > British Columbia (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning (Technical Appendices and Supplementary Material)

Neural Information Processing Systems | Jun-19-2026, 14:12:35 GMT

In addition, the coordinates must be normalized by the image size and scaled to the range [0, 1000].
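
As a rough illustration of the normalization described above, the sketch below maps a pixel-space bounding box to the [0, 1000] range. The helper name `normalize_box`, the (x1, y1, x2, y2) box layout, the integer rounding, and the clamping are all assumptions made for illustration; the benchmark's actual annotation format may differ.

```python
def normalize_box(box, width, height, scale=1000):
    """Map a pixel-space box (x1, y1, x2, y2) to the [0, scale] range.

    Hypothetical helper: the box layout, rounding, and clamping are
    assumptions, not taken from the benchmark's specification.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")

    def to_scaled(value, size):
        # Divide by the image size, scale, round, then clamp to [0, scale].
        return min(scale, max(0, round(value / size * scale)))

    x1, y1, x2, y2 = box
    return (to_scaled(x1, width), to_scaled(y1, height),
            to_scaled(x2, width), to_scaled(y2, height))


# A 640x480 image with a box from (64, 48) to (320, 240).
print(normalize_box((64, 48, 320, 240), width=640, height=480))
```

Normalizing by image size makes coordinates comparable across images of different resolutions, which is presumably why a fixed range is required.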

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.46)

Industry:

Education (0.67)
Banking & Finance (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Neural Information Processing Systems | Jun-19-2026, 14:12:31 GMT

Scoring the Optical Character Recognition (OCR) capabilities of Large Multimodal Models (LMMs) has witnessed growing interest. Existing benchmarks have highlighted the impressive performance of LMMs in text recognition; however, their abilities in certain challenging tasks, such as text localization, handwritten content extraction, and logical reasoning, remain underexplored. To bridge this gap, we introduce OCRBench v2, a large-scale bilingual text-centric benchmark with currently the most comprehensive set of tasks (4 more tasks than the previous multi-scene benchmark OCRBench), the widest coverage of scenarios (31 diverse scenarios), and thorough evaluation metrics, with 10,000 human-verified question-answering pairs and a high proportion of difficult samples. Moreover, we construct a private test set with 1,500 manually annotated images. The consistent evaluation trends observed across both public and private test sets validate the reliability of OCRBench v2. After carefully benchmarking state-of-the-art LMMs, we find that most LMMs score below 50 (out of 100) and suffer from five types of limitations: less frequently encountered text recognition, fine-grained perception, layout perception, complex element parsing, and logical reasoning.

large language model, machine learning, pattern recognition, (21 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (1.00)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.90)
(4 more...)

Global Minimizers of ℓp-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Neural Information Processing Systems | Jun-19-2026, 05:22:26 GMT

Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield these solutions? This paper addresses the challenge of finding the sparsest interpolating ReLU network--i.e., the network with the fewest nonzero parameters or neurons--a goal with wide-ranging implications for efficiency, generalization, interpretability, theory, and model compression. Unlike post hoc pruning approaches, we propose a continuous, almost-everywhere differentiable training objective whose global minima are guaranteed to correspond to the sparsest single-hidden-layer ReLU networks that fit the data. This result marks a conceptual advance: it recasts the combinatorial problem of sparse interpolation as a smooth optimization task, potentially enabling the use of gradient-based training methods. Our objective is based on minimizing ℓp quasinorms of the weights for 0 < p < 1, a classical sparsity-promoting strategy in finite-dimensional settings. However, applying these ideas to neural networks presents new challenges: the function class is infinite-dimensional, and the weights are learned using a highly nonconvex objective. We prove that, under our formulation, global minimizers correspond exactly to sparsest solutions. Our work lays a foundation for understanding when and how continuous sparsity-inducing objectives can be leveraged to recover sparse networks through training.
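
To make the sparsity-promoting effect of ℓp quasinorms with 0 < p < 1 concrete, the sketch below compares the generic penalty sum_i |w_i|^p on a sparse and a dense weight vector that share the same ℓ1 norm. This is only the classical finite-dimensional penalty, not the paper's training objective; the function name `lp_penalty` and the example vectors are illustrative assumptions.

```python
def lp_penalty(weights, p):
    """Return sum_i |w_i|^p, the p-th power of the l_p quasinorm.

    Illustrative helper (an assumption, not the paper's objective);
    requires 0 < p < 1, the regime discussed in the abstract.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return sum(abs(w) ** p for w in weights)


# Both vectors have l_1 norm equal to 1, so an l_1 penalty cannot tell them apart.
sparse = [1.0, 0.0, 0.0, 0.0]
dense = [0.25, 0.25, 0.25, 0.25]

print(lp_penalty(sparse, 0.5))  # penalty of the one-nonzero vector
print(lp_penalty(dense, 0.5))   # penalty of the evenly spread vector
```

With p = 0.5 the dense vector incurs twice the penalty of the sparse one, whereas their ℓ1 norms are identical; this is the sense in which the quasinorm favors solutions with fewer nonzero entries.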

artificial intelligence, machine learning, sout, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning

Neural Information Processing Systems | Jun-17-2026, 03:19:12 GMT

Large Language Models (LLMs) are increasingly used to simulate human users in interactive settings such as therapy, education, and social role-play. While these simulations enable scalable training and evaluation of AI agents, off-the-shelf LLMs often drift from their assigned personas, contradict earlier statements, or abandon role-appropriate behavior. We introduce a unified framework for evaluating and improving persona consistency in LLM-generated dialogue. We define three automatic metrics--prompt-to-line consistency, line-to-line consistency, and Q&A consistency--that capture different types of persona drift and validate each against human annotations. Using these metrics as reward signals, we apply multi-turn reinforcement learning to fine-tune LLMs for three user roles: a patient, a student, and a social chat partner. Our method reduces inconsistency by over 55%, resulting in more coherent, faithful, and trustworthy simulated users.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia (1.00)
North America > United States > Wisconsin (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.93)
Personal > Interview (0.92)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Government (1.00)
Education > Educational Setting > K-12 Education (1.00)
Health & Medicine > Consumer Health (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Pioneering UK Nerve Lab harnesses AI to map effect of children's screen time

The Guardian | Jun-13-2026, 11:00:13 GMT

Tim Smith: 'Today's short-form, fast-paced, highly captivating content may affect children's attention, comprehension and emotional response'. Parents are constantly being told to limit their children's screen time. A relatively slow-paced programme such as Bluey offers a very different viewing experience to a fast-moving action series such as PAW Patrol, yet both are broadly considered suitable for young children. This challenge is growing as the type of content children are exposed to evolves.

artificial intelligence, social media, (12 more...)

The Guardian

Country:

Europe (0.48)
North America > United States > Wisconsin (0.15)

Industry:

Media (0.96)
Leisure & Entertainment > Sports (0.70)
Health & Medicine > Therapeutic Area > Neurology (0.32)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.74)

Inside soccer's data renaissance

MIT Technology Review | Jun-11-2026, 10:00:00 GMT

Many of the insights hitting soccer pitches today trace back to Jesse Davis and a team of computer scientists open-sourcing tools for some of the sport's trickiest problems. Imagine tuning in to the opening kickoff of a World Cup match and seeing a player intentionally send the ball all the way down the pitch and right out of bounds on the opponent's end. Casual fans might scratch their heads. If you were Jesse Davis, though, you'd know that this play could be a prime setup to score. Davis is a professor of computer science at KU Leuven in Belgium and head of its Sports Analytics Lab, which has been at the vanguard of a data awakening in soccer since its inception more than a decade ago. Though the research group brings machine-learning models to bear on a variety of sports--including basketball, volleyball, and field hockey--nowhere is its impact felt more than on the soccer pitch.

artificial intelligence, machine learning, social media, (13 more...)

MIT Technology Review

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.25)
North America > United States > Wisconsin (0.15)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology:

Information Technology > Communications > Social Media (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Tree-Structured Orthonormal Decomposition of the Aitchison Simplex

Yamada, Daisuke, Zhang, Qijun, Pence, Travis, Bendlin, Barbara B., Rey, Federico, Singh, Vikas

arXiv.org Machine Learning | Jun-11-2026

Compositional data -- vectors encoding relative proportions -- arise across scientific domains, including ecology, geochemistry, and genomics. The features in these data often come with known hierarchical structure (e.g., taxonomies, phylogenies, ontologies), yet existing methods either ignore this structure, discard the intrinsic Aitchison geometry, are designed for binary trees, or yield incomplete coordinate systems. We describe PolyILR, a canonical orthonormal decomposition of the Aitchison tangent space aligned with any tree topology. Our construction defines a weighted local geometry at each internal node capturing full branching structure, then lifts these to a global orthonormal basis where every coordinate corresponds to a specific tree location. On microbiome and single-cell benchmarks, PolyILR yields stable, interpretable features and enables inference at multiscale tree resolution. We also establish a novel theoretical connection to softmax classifiers, suggesting possible applications to probabilistic modeling.

artificial intelligence, machine learning, tree-structured orthonormal decomposition, (17 more...)

arXiv.org Machine Learning

2606.11646

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Vector Space of Cycles

Chung, Moo K., El-Yaagoubi, Anass B., Ombao, Hernando

arXiv.org Machine Learning | Jun-9-2026

Most statistical and machine learning methods for directed interactions focus on pairwise effects among variables. Even existing cyclic models represent feedback primarily through node-level dependencies, making large-scale recurrent organization difficult to estimate and compare. This limitation is particularly acute in biological and neural systems, where interactions are highly recurrent and involve many overlapping cycles. We introduce a variational framework for statistical inference on cyclic interactions. Directed interactions are represented as edge flows on a simplicial complex and evolved under an energy-minimizing dynamical system. The resulting dynamics separate transient interaction components from persistent harmonic flows, yielding a low-dimensional cycle space that captures stable recurrent organization. Rather than enumerating individual cycles, the proposed framework represents cyclic interactions as elements of a Hilbert space, enabling projection, averaging, comparison, and population-level statistical inference. We establish theoretical properties of the harmonic projection, including characterization of the cycle space, variance reduction, and population inference. Simulations demonstrate substantially improved recovery of cyclic structure in dense recurrent systems compared with existing directed-interaction methods. Applied to resting-state fMRI from 400 human subjects, the framework reveals reproducible large-scale cyclic organization that is not detectable through edgewise averaging. These results provide a scalable statistical framework for studying recurrent interactions in high-dimensional dynamical systems.

artificial intelligence, interaction, machine learning, (18 more...)

arXiv.org Machine Learning

2606.08202

Country: